PP Attachment Ambiguity Resolution with Corpus-Based Pattern Distributions and Lexical Signatures
Authors
Abstract
We propose a method that combines unsupervised learning of lexical pattern frequencies with semantic information in order to improve the resolution of PP attachment ambiguity. Starting from the output of a robust parser, i.e. the set of all possible attachments for a given sentence, we query the Web to obtain statistical information about the frequency distributions of the attachments as well as lexical signatures of the terms in the patterns. All this information is used to weight the dependencies yielded by the parser and, eventually, to choose the most probable attachment.
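The weighting step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the additive smoothing, and the example counts are all assumptions made for illustration, the counts are supplied by hand rather than fetched from the Web, and the lexical signatures are left out entirely.

```python
from typing import Dict, Tuple

# A candidate attachment: (head word, preposition, noun inside the PP).
# This representation is an assumption made for the sketch.
Attachment = Tuple[str, str, str]


def weight_attachments(
    counts: Dict[Attachment, int], smoothing: float = 1.0
) -> Dict[Attachment, float]:
    """Turn raw pattern counts into relative weights that sum to 1.

    Additive smoothing (an assumption, not taken from the paper) keeps
    unseen patterns from receiving a weight of exactly zero.
    """
    if not counts:
        raise ValueError("at least one candidate attachment is required")
    total = sum(counts.values()) + smoothing * len(counts)
    return {att: (c + smoothing) / total for att, c in counts.items()}


def choose_attachment(counts: Dict[Attachment, int]) -> Attachment:
    """Return the candidate attachment with the highest weight."""
    weights = weight_attachments(counts)
    return max(weights, key=weights.get)


# Hypothetical counts for the two readings of "eat pizza with a fork":
# verb attachment (eat ... with fork) vs. noun attachment (pizza with fork).
example_counts: Dict[Attachment, int] = {
    ("eat", "with", "fork"): 120,
    ("pizza", "with", "fork"): 3,
}

weights = weight_attachments(example_counts)
best = choose_attachment(example_counts)
```

With these made-up counts the verb attachment receives the larger weight and is selected. In a real system the counts would come from corpus or Web queries over the parser's candidate attachments.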
Similar resources
Lexical and referential influences on on-line spoken language comprehension: A comparison of adults and primary-school-age children
This paper reports on two studies investigating children’s and adults’ processing of sentences containing ambiguity of prepositional phrase (PP) attachment. Study 1 used corpus data to investigate whether cues argued to be used by adults to resolve PP-attachment ambiguities are available in child-directed speech. Study 2 was an on-line reaction time study investigating the role of lexical and r...
Disambiguation of English PP Attachment using Multilingual Aligned Data
Prepositional phrase attachment (PP attachment) is a major source of ambiguity in English. It poses a substantial challenge to Machine Translation (MT) between English and languages that are not characterized by PP attachment ambiguity. In this paper we present an unsupervised, bilingual, corpus-based approach to the resolution of English PP attachment ambiguity. As data we use aligned linguist...
Corpus Based PP Attachment Ambiguity Resolution with a Semantic Dictionary
This paper deals with two important ambiguities of natural language: prepositional phrase attachment and word sense ambiguity. We propose a new supervised learning method for PP attachment based on a semantically tagged corpus. Because no sufficiently large sense-tagged corpus exists, we also propose a new unsupervised context-based word sense disambiguation algorithm which amends the tra...
Relative clause attachment ambiguity resolution in Persian
The present study examines how Persian native speakers resolve relative clause attachment ambiguities in sentences containing a complex NP of the type NP of NP followed by a relative clause (RC). Previous off-line studies have found a preference for high attachment. In the present study, an on-line technique was used to help identify the nature of this process. Persian speakers were pre...
Contribution of Complex Lexical Information to Solve Syntactic Ambiguity in Basque
In this study, we explore the impact of complex lexical information to solve syntactic ambiguity, including verbal subcategorization in the form of verbal transitivity and verb-noun-case or verb-noun-case-auxiliary relations. The information was obtained from different sources, including a subcategorization dictionary extracted from a Basque corpus, the web as a corpus, an English corpus and a ...